
    Why is unsupervised alignment of English embeddings from different algorithms so hard?

    This paper presents a challenge to the community: generative adversarial networks (GANs) can perfectly align independent English word embeddings induced using the same algorithm, based on distributional information alone, but fail to do so for embeddings induced by two different algorithms. Why is that? We believe understanding why is key to understanding both modern word embedding algorithms and the limitations and instability dynamics of GANs. This paper shows that (a) in all these cases where alignment fails, there exists a linear transform between the two embeddings (so algorithm biases do not lead to non-linear differences), and (b) similar effects cannot easily be obtained by varying hyper-parameters. One plausible suggestion based on our initial experiments is that differences in the inductive biases of the embedding algorithms lead to an optimization landscape that is riddled with local optima, leaving a very small basin of convergence, but we present this more as a challenge paper than a technical contribution. Comment: Accepted at EMNLP 201

    Code-Switching as Strategically Employed in Political Discourse

    There is extensive scholarship in the field of sociolinguistics on mediated political discourse as strategically employed to gain support in the run-up to and during elections. Among other things, this work reveals that the rhetorical success of politicians greatly depends on their ability to strike the right balance between the expression of authority and solidarity in their speech performances. The use of code-switching in achieving such balance has been touched upon in some case studies but never studied in depth. I analyse the speech of Boyko Borisov, now Prime Minister of Bulgaria (and at the time of recording, a candidate for the position), in the framework of Bell’s (1984) audience and referee design theory, with reference to Myers Scotton and Ury’s (1977) views on code-switching. Borisov is found to employ two codes, a standard and a nonstandard one, characteristic of two different personae of his: the authoritative politician and the folksy, regular person. Depending on the situation, he chooses to act out either just one of these personae or both of them by switching between the two codes, thus maintaining the aforementioned vital balance between the expression of power and solidarity. The analysis reveals that the switches occur at specific points in the conversation, in line with existing theory on metaphorical code-switching, confirming that they are strategic in nature rather than random or accidental.

    Copenhagen at CoNLL–SIGMORPHON 2018: Multilingual Inflection in Context with Explicit Morphosyntactic Decoding

    This paper documents the Team Copenhagen system, which placed first in the CoNLL–SIGMORPHON 2018 shared task on universal morphological reinflection, Task 2, with an overall accuracy of 49.87. Task 2 focuses on morphological inflection in context: generating an inflected word form given the lemma of the word and the context it occurs in. Previous SIGMORPHON shared tasks have focused on context-agnostic inflection; the "inflection in context" task was introduced this year. We approach it with an encoder-decoder architecture over character sequences with three core innovations, all contributing to an improvement in performance: (1) a wide context window; (2) a multi-task learning approach with the auxiliary task of morphosyntactic description (MSD) prediction; (3) training models in a multilingual fashion.

    A Probabilistic Generative Model of Linguistic Typology

    In the principles-and-parameters framework, the structural features of languages depend on parameters that may be toggled on or off, with a single parameter often dictating the status of multiple features. The implied covariance between features inspires our probabilisation of this line of linguistic inquiry: we develop a generative model of language based on exponential-family matrix factorisation. By modelling all languages and features within the same architecture, we show how structural similarities between languages can be exploited to predict typological features with near-perfect accuracy, outperforming several baselines on the task of predicting held-out features. Furthermore, we show that language embeddings pre-trained on monolingual text allow for generalisation to unobserved languages. This finding has clear practical and also theoretical implications: the results confirm what linguists have hypothesised, i.e. that there are significant correlations between typological features and languages. Comment: NAACL 2019, 12 pages
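The core mechanism, exponential-family matrix factorisation with a Bernoulli likelihood, can be sketched in a few lines of numpy. This is an illustrative toy on synthetic binary features, not the paper's model or data: each language gets a vector U[l], each feature a vector V[f], and the probability that a language exhibits a feature is sigmoid(U[l] · V[f]); held-out cells are predicted from the fitted factors.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n_lang, n_feat, k = 30, 20, 4

# Synthetic low-rank "typology" standing in for real binary feature data
U_true = rng.standard_normal((n_lang, k))
V_true = rng.standard_normal((n_feat, k))
Y = (rng.random((n_lang, n_feat)) < sigmoid(U_true @ V_true.T)).astype(float)

observed = rng.random(Y.shape) > 0.1      # hold out ~10% of cells

# Fit language and feature factors by gradient ascent on the
# Bernoulli log-likelihood of the observed cells only.
U = 0.01 * rng.standard_normal((n_lang, k))
V = 0.01 * rng.standard_normal((n_feat, k))
lr = 0.05
for _ in range(2000):
    G = observed * (Y - sigmoid(U @ V.T))  # gradient w.r.t. the logits
    U, V = U + lr * G @ V, V + lr * G.T @ U

P = sigmoid(U @ V.T)
train_acc = ((P > 0.5) == Y)[observed].mean()
held_out_acc = ((P > 0.5) == Y)[~observed].mean()
```

Because all languages share the feature factors V (and vice versa), structure learned from well-documented languages transfers to the held-out cells, which is the intuition the abstract appeals to.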

    Indicatements that character language models learn English morpho-syntactic units and regularities

    Character language models have access to surface morphological patterns, but it is not clear whether or how they learn abstract morphological regularities. We instrument a character language model with several probes, finding that it can develop a specific unit to identify word boundaries and, by extension, morpheme boundaries, which allows it to capture linguistic properties and regularities of these units. Our language model proves surprisingly good at identifying the selectional restrictions of English derivational morphemes, a task that requires both morphological and syntactic awareness. Thus we conclude that, when morphemes overlap extensively with the words of a language, a character language model can perform morphological abstraction.
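The probing methodology can be sketched as a diagnostic classifier: collect the model's hidden state at each character position, fit a logistic regression predicting a linguistic label (here, word boundaries), and inspect which hidden units carry the signal. The sketch below uses synthetic "hidden states" with one unit wired to the boundary label, standing in for real character-LM activations; everything here is illustrative, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(2)
T, H = 5000, 32                                # positions, hidden size
labels = (rng.random(T) < 0.2).astype(float)   # ~20% of positions end a word
states = rng.standard_normal((T, H))
states[:, 7] += 3.0 * labels                   # unit 7 acts as a "boundary unit"

# Logistic-regression probe trained by full-batch gradient ascent
w, b = np.zeros(H), 0.0
lr = 0.5
for _ in range(300):
    p = 1 / (1 + np.exp(-(states @ w + b)))
    g = labels - p
    w += lr * states.T @ g / T
    b += lr * g.mean()

# If the model really dedicates a unit to boundaries, the probe's largest
# weight should land on it, and boundary prediction should beat the base rate.
top_unit = int(np.argmax(np.abs(w)))
acc = (((1 / (1 + np.exp(-(states @ w + b)))) > 0.5) == labels).mean()
```

On real activations, the same recipe recovers which units, if any, the character LM has specialised for segment boundaries.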